CLAIMS

1. A computer processing apparatus for classifying a
document, comprising:

means for accessing a database structure providing
a plurality of different subject matter categories, the
database containing a classified vocabulary consisting of
terms in all of the different subject matter categories
with each term being classified in accordance with the
subject matter category structure of the database;

means for receiving in computer-readable form a text
document to be classified;

processor means operable to compare terms appearing
in the text document with the terms in the classified
vocabulary and to determine from the comparison the
category for the document; and

means for supplying a signal carrying data
representing the text document and data associating the
text document with the determined category.

2. A computer processing apparatus for checking
spelling in a document, comprising:

means for accessing a database structure providing
a plurality of different subject matter categories, the
database containing a classified vocabulary consisting of
terms in all of the different subject matter categories
with each term being classified in accordance with the
subject matter category structure of the database;

means for receiving in computer-readable form a text
document to be spell-checked;

processor means operable to compare terms appearing
in the text document with the terms in the classified
vocabulary, to determine from the comparison the category
for the document, to identify any term in the document
not present in the classified vocabulary and to determine
the term or terms in the classified vocabulary closest to
an unidentified term and having the same category as that
determined for the document; and

means for supplying a user with said determined term
or terms.
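
The spell-checking flow of claim 2 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy VOCABULARY, the majority-count classifier, and the use of Python's difflib similarity as the measure of "closest". The claim itself does not specify any of these.

```python
import difflib

# Hypothetical classified vocabulary: term -> category (illustrative data only).
VOCABULARY = {
    "striker": "sport",
    "goal": "sport",
    "penalty": "sport",
    "strike": "industry",
    "union": "industry",
    "wages": "industry",
}


def classify(terms):
    """Return the category that accounts for the most recognised terms."""
    counts = {}
    for term in terms:
        category = VOCABULARY.get(term)
        if category is not None:
            counts[category] = counts.get(category, 0) + 1
    return max(counts, key=counts.get) if counts else None


def suggest(terms):
    """Suggest, for each unknown term, close terms from the document's category."""
    category = classify(terms)
    # Only vocabulary terms in the document's own category are candidates.
    candidates = [t for t, c in VOCABULARY.items() if c == category]
    suggestions = {}
    for term in terms:
        if term not in VOCABULARY:
            # Assumed closeness measure: difflib's sequence similarity ratio.
            suggestions[term] = difflib.get_close_matches(
                term, candidates, n=3, cutoff=0.6
            )
    return category, suggestions


category, suggestions = suggest(["goal", "penalty", "strikr"])
print(category, suggestions)
```

In this sketch the misspelling "strikr" is matched against sport terms only, so "strike" (classified under industry) is never offered.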






3. A computer processing apparatus for refining the
results of a subject matter search carried out by a
search engine using a keyword, the apparatus comprising:

means for accessing a database having a database
structure providing a plurality of different subject
matter categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database;

means for receiving in computer-readable form
documents forming the results of the subject matter
search;

processor means operable to compare the keyword used
to carry out the search with the classified vocabulary to
determine each category with which the keyword is
associated;

means for advising a user of the different
categories with which the keyword is associated;

user-operable means selection means for enabling a
user to select one of said different categories;

means for comparing the terms used in the search
result documents with the terms in the classified
vocabulary; and

means for supplying the user with information
relating the search results to the selected category.






4. Apparatus according to claim 1, wherein the
processor means is operable to determine the category for
the document by determining from the comparison the
category or categories of terms in the document,
assigning weightings to the determined categories for the
terms, and assigning the document being classified to the
category having the highest weighting.

5. Apparatus according to claim 2, wherein the
processor means is operable to determine the category for
the document by determining from the comparison the
category or categories of terms in the document,
assigning weightings to the determined categories for the
terms, and assigning the document being classified to the
category having the highest weighting.

6. Apparatus according to claim 3, wherein the
processor means is operable to determine the category for
the document by determining from the comparison the
category or categories of terms in the document,
assigning weightings to the determined categories for the
terms, and assigning the document being classified to the
category having the highest weighting.

7. Apparatus according to claim 4, wherein the
processor means is operable, for each term in the
classified vocabulary and in the text document, to share
a predetermined weighting factor between each category
associated with the term.
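
The weighting scheme of claims 4 and 7 can be sketched as follows. This is an illustrative reading only: the VOCABULARY contents, the value of WEIGHT_PER_TERM, and the choice to split the weight equally between a term's categories are assumptions, not details stated in the claims.

```python
from collections import defaultdict

# Hypothetical vocabulary: term -> every category it is classified under
# (one category per meaning of the term).
VOCABULARY = {
    "bank": ["finance", "geography"],
    "loan": ["finance"],
    "interest": ["finance", "mind"],
    "river": ["geography"],
}

# Assumed value for the "predetermined weighting factor" of claim 7.
WEIGHT_PER_TERM = 1.0


def category_weights(terms):
    """Accumulate a weight per category, sharing each term's weight
    equally between the categories associated with that term."""
    weights = defaultdict(float)
    for term in terms:
        categories = VOCABULARY.get(term, [])  # unknown terms add nothing
        for category in categories:
            weights[category] += WEIGHT_PER_TERM / len(categories)
    return dict(weights)


def classify(terms):
    """Assign the document to the category with the highest total weight."""
    weights = category_weights(terms)
    return max(weights, key=weights.get) if weights else None


document = ["bank", "loan", "interest", "unknown"]
print(category_weights(document))
print(classify(document))
```

Sharing the weight means an ambiguous term such as "bank" contributes less to any single category than an unambiguous term such as "loan".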



8. Apparatus according to claim 1, wherein the
accessing means is arranged to access a plurality of
collocations also forming part of the database, each
collocation being associated with a specific different
one of the subject matter categories and each collocation
consisting of a plurality of terms exemplifying the
associated category.

9. Apparatus for generating a database for storage on
a computer-readable medium, comprising:

means for storing terms;

means for associating each term with one of a number
of different subject matter categories;

means for associating all terms falling within the
same category with a common code identifying a
collocation of terms exemplifying that category so that
terms in different categories are associated with
different codes identifying different collocations with
each collocation being specific to the associated
category; and

means for supplying as a database each term together
with the associated code.
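
A minimal sketch of the database generation described in claim 9 is given below. The function name build_database, the use of sequential integers as codes, and the in-memory list and dictionary layout are all assumptions chosen for illustration.

```python
def build_database(term_categories):
    """Build (database, collocations) from (term, category) pairs.

    database     -- list of (term, code) records, one per stored term
    collocations -- code -> set of terms exemplifying that category
    """
    codes = {}         # category -> common code shared by all its terms
    collocations = {}  # code -> collocation (set of terms)
    database = []
    for term, category in term_categories:
        # The first term seen in a category allocates a new code;
        # later terms in the same category reuse it.
        code = codes.setdefault(category, len(codes) + 1)
        collocations.setdefault(code, set()).add(term)
        database.append((term, code))
    return database, collocations


pairs = [("goal", "sport"), ("union", "industry"), ("penalty", "sport")]
database, collocations = build_database(pairs)
print(database)
print(collocations)
```

Each record carries only the term and its code, and the code in turn identifies the collocation for that category.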

10. Apparatus according to claim 9, further comprising
means storing said collocations.

11. Apparatus according to claim [illegible], wherein the
supplying means is arranged also to supply the
collocations with the database.

12. A computer processing apparatus for classifying a
document, comprising:

means for accessing a database having a database
structure providing a plurality of different subject
matter categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category;

means for receiving in computer-readable form a text
document to be classified;

processor means operable to compare terms appearing
in the text document with the collocations to determine
the collocation having the most terms in common with the
document, and to allocate the category of the determined
collocation to the document; and

means for supplying a signal carrying data
representing the text document and data associating the
text document with the determined category.
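
The collocation-based classification of claim 12 amounts to picking the collocation that shares the most terms with the document. The sketch below assumes toy COLLOCATIONS data and treats the document as an already tokenised list of terms. Returning None when nothing overlaps is an added assumption.

```python
# Hypothetical collocations: category -> set of terms exemplifying it.
COLLOCATIONS = {
    "sport": {"goal", "penalty", "referee", "striker"},
    "industry": {"union", "wages", "strike", "employer"},
}


def classify_by_collocation(document_terms):
    """Return the category whose collocation has the most terms in
    common with the document, or None if no collocation overlaps."""
    document = set(document_terms)
    overlaps = {
        category: len(document & terms)
        for category, terms in COLLOCATIONS.items()
    }
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None


terms = ["the", "referee", "gave", "a", "penalty", "strike"]
print(classify_by_collocation(terms))
```

Here the document shares two terms with the sport collocation and one with the industry collocation, so it is allocated to sport.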

13. A computer processing apparatus for checking
spelling in a document, comprising:

means for accessing a database having a database
structure providing a plurality of different subject
matter categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category;

means for receiving in computer-readable form a text
document to be spell-checked;

processor means operable to compare terms appearing
in the text document with the collocations to determine
the collocation having most terms in common with the text
document, to select the category of that collocation as
the category for the document, to identify any term in
the document not present in the classified vocabulary and
to determine the term or terms in the classified
vocabulary closest to an unidentified term and having the
same category as that determined for the document; and

means for advising a user of the determined term or
terms.

14. A computer processing apparatus for refining the
results of a subject matter search carried out by a
search engine using a keyword, the apparatus comprising:

means for accessing a database having a database
structure providing a plurality of different subject
matter categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category;

means for receiving in computer-readable form
documents forming the results of the subject matter
search;

processor means operable to compare the keyword used
to carry out the search with the classified vocabulary to
determine each category with which the keyword is
associated;

means for advising a user of the different
categories with which the keyword is associated;

user-operable means selection means for enabling a
user to select one of said different categories;

means for accessing the collocation associated with
the selected category;

means for comparing the terms used in the search
result documents with the terms in the accessed
collocation; and

means for supplying the user with information
relating the search results to the selected category.

15. Apparatus according to claim 14, wherein the
supplying means is arranged to supply the user with
details of the search results having greater than a
predetermined number of terms in common with the accessed
collocation.
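
The filtering step of claim 15 can be sketched as a threshold on shared terms. The function name refine_results, the (title, text) shape of a search result, and the lower-case whitespace tokenisation are assumptions made for this example.

```python
def refine_results(results, collocation, min_common):
    """Return titles of results sharing more than min_common terms
    with the collocation of the category the user selected."""
    kept = []
    for title, text in results:
        # Assumed tokenisation: lower-case and split on whitespace.
        common = set(text.lower().split()) & collocation
        if len(common) > min_common:
            kept.append(title)
    return kept


# Hypothetical collocation for a "finance" category chosen by the user.
finance = {"loan", "interest", "deposit", "account"}
results = [
    ("A", "open an account for a loan with low interest"),
    ("B", "the river bank flooded"),
    ("C", "deposit account"),
]
print(refine_results(results, finance, min_common=1))
```

Raising min_common makes the filter stricter; with min_common=2 only result "A" would remain in this example.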






16. Apparatus according to claim 9, wherein the 
processor means is operable to disambiguate between 
different meanings of terms by using the collocations. 



17. Apparatus according to claim 12, wherein the
processor means is operable to disambiguate between
different meanings of terms by using the collocations.

18. Apparatus according to claim 14, wherein the
processor means is operable to disambiguate between
different meanings of terms by using the collocations.
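
One possible reading of the disambiguation in claims 16 to 18 is to choose, for an ambiguous term, the meaning whose collocation overlaps most with the surrounding terms. The sketch below follows that reading. SENSES, COLLOCATIONS and the tie-breaking behaviour (the first listed category wins) are assumptions, since the claims do not say how the collocations are used.

```python
# Hypothetical data: an ambiguous term and the categories of its meanings.
SENSES = {"bank": ["finance", "geography"]}

# Hypothetical collocations: category -> terms exemplifying it.
COLLOCATIONS = {
    "finance": {"loan", "interest", "deposit"},
    "geography": {"river", "valley", "estuary"},
}


def disambiguate(term, context_terms):
    """Pick the category (meaning) of `term` whose collocation shares
    the most terms with the context; ties go to the first listed."""
    candidates = SENSES.get(term, [])
    if not candidates:
        return None
    context = set(context_terms)
    return max(candidates, key=lambda c: len(context & COLLOCATIONS[c]))


print(disambiguate("bank", ["the", "river", "bank", "flooded"]))
print(disambiguate("bank", ["bank", "loan", "interest"]))
```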

19. Apparatus according to claim 7, wherein the
accessing means is arranged to access the collocations
from store means separate from the remainder of the
database.

20. Apparatus according to claim 1, further comprising
store means configured to store the database.

21. Apparatus according to claim 1, further comprising
store means storing the database.

22. Apparatus according to claim 1, wherein the database
structure provides said plurality of subject matter
categories as a tree structure consisting of a plurality
of main subject matter areas each divided into two or
more subsidiary subject matter areas.

23. Apparatus according to claim 1, wherein the database
structure provides said plurality of subject matter
categories such that each category is defined by a
subject matter area and a species or genus.

24. Apparatus according to claim 23, wherein the
database provides said plurality of subject matter
categories such that the species or geni are people,
places, organisations, products and technology.

25. Apparatus according to claim 23, wherein the
database structure provides said plurality of subject
matter categories such that the species or genus are the
same for each subject matter area.

26. Apparatus according to claim 1, wherein the database
provides categories in each of the following subject
matter areas: the universe, the earth, the environment,
natural history, humanity, recreation, society, the mind
and human history.

27. Apparatus according to claim 1, wherein the database
structure is such that, for a given meaning, a term is
associated with only one category and different meanings
of the same term are associated with different
categories.

28. Apparatus according to claim 1, wherein the
supplying means comprises means for storing a signal
supplied by the supplying means on a computer readable
medium.



29. Apparatus according to claim 1, wherein the
supplying means comprises means for forwarding a signal
supplied by the supplying means to another processing
apparatus.

30. Apparatus according to claim 1, wherein the
supplying means comprises means for displaying the
information to a user.






31. In a computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and means for receiving in
computer-readable form a text document to be classified,
a method of classifying documents comprising:

comparing terms appearing in the text document with
the terms in the classified vocabulary;

determining from the comparison the category for the
document; and

supplying a signal carrying data representing the
text document and data associating the text document with
the determined category.

32. In a computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and means for receiving in
computer-readable form a text document to be
spell-checked, a method of checking spelling in a
document comprising:

comparing terms appearing in the text document with
the terms in the classified vocabulary;

determining from the comparison the category for the
document;

identifying any term in the document not present in
the classified vocabulary;

determining the term or terms in the classified
vocabulary closest to an unidentified term and having the
same category as the document; and

advising a user of the determined term or terms.

33. In computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and means for receiving in
computer-readable form documents forming the results of
the subject matter search, a method of refining the
results of a subject matter search carried out by a
search engine using a keyword, the method comprising:

comparing the keyword used to carry out the search
with the classified vocabulary to determine each category
with which the keyword is associated;

advising a user of the different categories with
which the keyword is associated;

identifying the one of said categories selected by
a user using user-operable selection means;

comparing the terms used in the search result
documents with the terms in the classified vocabulary;
and

supplying the user with information relating the
search results to the selected category.

34. A method according to claim 31, comprising
determining the category for the document by determining
from the comparison the category or categories of the
terms in the document, assigning weightings to the
determined categories for the terms, and assigning the
document being classified to the category having the
highest weighting.

35. A method according to claim 34, which comprises
assigning weightings by, for each term in the classified
vocabulary and in the text document, sharing a
predetermined weighting factor between each category
associated with the term.

36. A method according to claim 31 which also comprises
accessing a plurality of collocations also forming part
of the database, each collocation being associated with
a specific different one of the subject matter categories
and each collocation consisting of a plurality of terms
exemplifying the associated category.

37. In a computer apparatus having data storage means,
a method of generating a database for storage on a
computer readable medium, comprising:

storing terms;

associating each term with one of a number of
different subject matter categories;

associating all terms falling within the same
category with a common code identifying a collocation of
terms exemplifying that category so that terms in
different categories are associated with different codes
identifying different collocations with each collocation
being specific to the associated category; and

supplying as a database each term together with the
associated code.

38. A method according to claim 37, which comprises
supplying the collocations with the database.

39. In a computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category
and having means for receiving in computer-readable form
a text document to be classified, a method of classifying
documents comprising:

comparing terms appearing in the text document with
the collocations to determine the collocation having the
most terms in common with the document;

allocating the category of the determined
collocation to the document; and

supplying a signal carrying data representing the
text document and data associating the text document with
the determined category.

40. In a computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter category structure of the database, and the
database also containing a plurality of collocations each
collocation being associated with a specific different
one of the subject matter categories and each collocation
consisting of a plurality of terms exemplifying the
associated category and having means for receiving in
computer-readable form a text document to be
spell-checked, a method of checking spelling in a
document comprising:

comparing terms appearing in the text document with
the collocations to determine the collocation having most
terms in common with the text document;

selecting the category of that collocation as the
category for the document;

identifying any term in the document not present in
the classified vocabulary;

determining the term or terms in the classified
vocabulary closest to an unidentified term and having the
same category as that selected for the document; and

advising a user of the determined term or terms.

41. In a computer processing apparatus having means for
accessing a database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category,
and having means for receiving in computer-readable form
documents forming the results of a subject matter search
carried out by a search engine using a keyword, a method
of refining the search results, comprising:

comparing the keyword used to carry out the search
with the classified vocabulary to determine each category
with which the keyword is associated;

advising a user of the different categories with
which the keyword is associated;

determining which of said categories is selected by
a user using user-operable means selection means;

accessing the collocation associated with the
selected category;

comparing the terms used in the search result
documents with the terms in the accessed collocation; and

supplying the user with information relating the
search results to the selected category.

42. A method according to claim 41, which comprises
supplying the user with details of the search results
having greater than a predetermined number of terms in
common with the accessed collocation.

43. A method according to claim 36, which comprises
accessing the collocations from store means separate from
the remainder of the database.

44. A method according to claim 37, which comprises
structuring the database to provide said plurality of
subject matter categories as a tree structure consisting
of a plurality of main subject matter areas each divided
into two or more subsidiary subject matter areas.

45. A method according to claim 37, which comprises
structuring the database to provide said plurality of
subject matter categories such that each category is
defined by a subject matter area and a species or genus.

46. A method according to claim 45, which comprises
structuring the database to provide said plurality of
subject matter categories such that the species or geni
are people, places, organisations, products and
technology.

47. A method according to claim 45, which comprises
structuring the database structure to provide said
plurality of subject matter categories such that the
species or genus are the same for each subject matter
area.

48. A method according to claim 37, which comprises
structuring the database to provide categories in each of
the following subject matter areas: the universe, the
earth, the environment, natural history, humanity,
recreation, society, the mind and human history.






49. A method according to claim 37, which comprises
structuring the database such that, for a given meaning,
a term is associated with only one category and different
meanings of the same term are associated with different
categories.

50. A method according to claim 31, which comprises
carrying out the supplying by storing a signal on a
computer-readable medium.

51. A method according to claim 31, which comprises
carrying out the supplying by forwarding a signal to
another processing apparatus.

52. A method according to claim 31, which comprises
displaying the information to a user.

53. A database for use with an apparatus in accordance
with claim 1, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database.

54. A database for use with an apparatus in accordance
with claim 2, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database.

55. A database for use with an apparatus in accordance
with claim 3, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database.

56. A database for use with an apparatus in accordance
with claim 12, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category.

57. A database for use with an apparatus in accordance
with claim 13, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying that associated category.

58. A database for use with an apparatus in accordance
with claim 14, the database having a database structure
providing a plurality of different subject matter
categories, the database containing a classified
vocabulary consisting of terms in all of the different
subject matter categories with each term being classified
in accordance with the subject matter category structure
of the database and the database also containing a
plurality of collocations each collocation being
associated with a specific different one of the subject
matter categories and each collocation consisting of a
plurality of terms exemplifying the associated category.

59. A database according to claim 55, wherein the
database structure provides said plurality of subject
matter categories as a tree structure consisting of a
plurality of main subject matter areas each divided into
two or more subsidiary subject matter areas.

60. A database according to claim 55, wherein the
database structure provides said plurality of subject
matter categories such that each category is defined by
a subject matter area and a species or genus.



61. A database according to claim 60, wherein the
database provides said plurality of subject matter
categories such that the species or geni are people,
places, organisations, products and technology.

62. A database according to claim 60, wherein the
database structure provides said plurality of subject
matter categories such that the species or genus are the
same for each subject matter area.

63. A database according to claim 59, wherein the
database provides categories in each of the following
subject matter areas: the universe, the earth, the
environment, natural history, humanity, recreation,
society, the mind and human history.

64. A database according to claim 59, wherein the
database structure is such that, for a given meaning, a
term is associated with only one category and different
meanings of the same term are associated with different
categories.




65. Apparatus for classifying electronic documents,
comprising:

storage means storing a classification scheme having
a plurality of collocations each collocation being
associated with a respective different subject matter
area and containing a set of terms which exemplify that
subject matter area;

means for comparing terms used in a document to be
classified with the terms in said collocations;

means for allocating the document being classified
to the one of said collocations which said comparing
means identifies as having the most number of terms in
common with the document being classified;

means for associating with the document being
classified a code representing the subject matter area of
the allocated collocation; and

means for storing the document together with the
associated code.






66. Apparatus for filtering electronically stored 
documents forming the results of a search carried out by 
a search engine on the basis of a keyword supplied to the 
search engine by a user, comprising: 

means storing a classification scheme divided into 
a number of collocations each associated with a specific 
different one of a number of different subject matter 
areas, each collocation containing a set of terms which 
exemplify the associated subject matter area; 

means storing a vocabulary or dictionary of words 
with each word in the vocabulary being associated with 
one or more of said collocations, a description of the 
subject area of each associated collocation and a 
respective different definition of the word for each 
associated collocation; 

means for determining from the vocabulary storing 
means each collocation with which the keyword is 
associated; 

a user interface for providing the user with the 
subject area descriptions of each collocation with which 
the keyword is associated and for requesting the user to 
select one of said collocations; and 

means responsive to the selection of a collocation 
by the user for comparing the terms contained in the 
selected collocation with terms used in each of the 
documents identified by the search engine and for 
providing the user with only those of said documents 
having more than a predetermined number of terms in 
common with the selected collocation. 
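
The filtering recited in claim 66 can likewise be sketched in Python. This is a rough illustration under stated assumptions, not the claimed apparatus: the vocabulary entry, the collocations, the descriptions and definitions, the threshold and all function names are hypothetical, and the user's selection is passed in as an argument rather than gathered through a user interface.

```python
import re

# Hypothetical vocabulary: word -> {collocation code: (subject area description, definition)}.
VOCABULARY = {
    "bank": {
        "FINANCE": ("money and banking", "an institution that holds deposits"),
        "RIVERS": ("rivers and waterways", "the sloping land beside a river"),
    },
}

# Hypothetical collocations: code -> set of terms exemplifying the subject area.
COLLOCATIONS = {
    "FINANCE": {"bank", "loan", "interest", "deposit", "account"},
    "RIVERS": {"bank", "river", "stream", "flood", "water"},
}


def tokenize(text):
    """Return the set of lower-case alphabetic words in the text (assumed tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def describe_senses(keyword, vocabulary):
    """Map each collocation associated with the keyword to its subject area description."""
    senses = vocabulary.get(keyword.lower(), {})
    return {code: description for code, (description, _definition) in senses.items()}


def filter_results(documents, selected_code, collocations, threshold):
    """Keep only documents sharing more than `threshold` terms with the selected collocation."""
    chosen = collocations[selected_code]
    return [doc for doc in documents if len(tokenize(doc) & chosen) > threshold]
```

In use, the descriptions returned by `describe_senses` would be offered to the user, and the code the user picks would be passed to `filter_results` together with the search engine's results.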

67. A data carrier carrying a first set of data divided 
into a number of collocations each associated with a 
specific different one of a number of different subject 
matter areas with each collocation containing a set of 
terms which exemplify the associated subject matter area, 
and a second set of data comprising a vocabulary or 
dictionary of terms with each entry in the vocabulary 
being associated with a respective different code 
associating it with a specific one of said collocations 
for each different context or meaning of the entry. 

68. Apparatus for storing data on a computer-readable 
storage medium, comprising: 

means for storing items of data; 

means for associating each item of data with one of 
a number of subject matter areas such that each item of 
data belongs to at least one subject matter area; 

means for associating each item of data with one of 
a number of different species areas or genera so that 
each item of data is associated with only one genus; and 

means for directly or indirectly writing each item 
of data together with information identifying the 
associated subject matter area and genus onto a computer 
readable storage medium. 



69. Apparatus for processing computer usable data, 
comprising: 

means for storing items of data; 

means for associating each item of data with at 
least one of a number of different subject matter areas; 

means for associating each item of data with only 
one of a number of species areas or genera; and 

means for generating a signal carrying each item of 
data together with information identifying the associated 
subject matter area and genus. 

70. A signal carrying processor implementable 
instructions for causing apparatus to become configured 
to form apparatus in accordance with claim 1. 

71. A signal carrying processor implementable 
instructions for causing apparatus to become configured 
to form apparatus in accordance with claim 2. 

72. A signal carrying processor implementable 
instructions for causing apparatus to become configured 
to form apparatus in accordance with claim 3. 

73. A signal carrying a database in accordance with 
claim 53 or a plurality of collocations for use with the 
database. 



74. A storage medium carrying a database in accordance 
with claim 53 or a plurality of collocations for use with 
the database. 



75. A processor readable medium storing processor 
readable instructions for causing a processor to: 

access a database having a database structure 
providing a plurality of different subject matter 
categories, the database containing a classified 
vocabulary consisting of terms in all of the different 
subject matter categories with each term being classified 
in accordance with the subject matter category structure 
of the database and means for receiving in computer- 
readable form a text document to be classified; 

compare terms appearing in the text document with 
the terms in the classified vocabulary; 

determine from the comparison the category for the 
document; and 

supply a signal carrying data representing the text 
document and data associating the text document with the 
determined category. 

76. A processor readable medium storing processor 
readable instructions for causing a processor to: 

access a database having a database structure 
providing a plurality of different subject matter 
categories, the database containing a classified 
vocabulary consisting of terms in all of the different 
subject matter categories with each term being 
classified in accordance with the subject matter category 
structure of the database; 

receive a text document to be spell-checked; 

compare terms appearing in the text document with 
the terms in the classified vocabulary; 

determine from the comparison the category for the 
document; 

identify any term in the document not present in the 
classified vocabulary; and 

advise a user of the determined term or terms. 
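
The compare, determine and identify steps of claim 76 might be sketched as follows. This is a simplified Python illustration, not the claimed implementation: the toy vocabulary, the category names and the function name are assumptions, and because the toy vocabulary holds only a handful of terms, every other word in the text is reported as unidentified.

```python
import re

# Hypothetical classified vocabulary: term -> subject matter category.
CLASSIFIED_VOCABULARY = {
    "star": "ASTRONOMY",
    "planet": "ASTRONOMY",
    "orbit": "ASTRONOMY",
    "loan": "FINANCE",
    "interest": "FINANCE",
}


def check_document(text, vocabulary):
    """Return (category, unidentified_terms) for the text.

    The category is the one whose vocabulary terms occur most often in the
    text, or None if no term is recognised. Unidentified terms are the words
    absent from the vocabulary, in order of first appearance.
    """
    counts = {}
    unidentified = []
    for word in re.findall(r"[a-z]+", text.lower()):
        category = vocabulary.get(word)
        if category is None:
            if word not in unidentified:
                unidentified.append(word)
        else:
            counts[category] = counts.get(category, 0) + 1
    best = max(counts, key=counts.get) if counts else None
    return best, unidentified
```

The returned list of unidentified terms is what would be brought to the user's attention.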



77. A processor readable medium storing processor 
readable instructions for causing a processor to: 

access a database having a database structure 
providing a plurality of different subject matter 
categories, the database containing a classified 
vocabulary consisting of terms in all of the different 
subject matter categories with each term being classified 
in accordance with the subject matter category structure 
of the database; 

receive documents forming the results of the subject 
matter search; 

compare the keyword used to carry out the search 
with the classified vocabulary to determine each category 
with which the keyword is associated; 

advise a user of the different categories with one 
of the subject matter categories and each collocation 
consisting of a plurality of terms exemplifying the 
associated category. 

78. A processor readable medium storing processor 
readable instructions for causing a processor to: 

store terms; 

associate each term with one of a number of 
different subject matter categories; 

associate all terms falling within the same category 
with a common code identifying a collocation of terms 
exemplifying that category so that terms in different 
categories are associated with different codes 
identifying different collocations with each collocation 
being specific to the associated category; and 

supply as a database each term together with the 
associated code. 



